Word Sense Disambiguation Based on Structured Semantic Space

نویسندگان

  • Dong-Hong Ji
  • Changning Huang
چکیده

In this paper, we propose a framework, structured semantic space, as a foundation for word sense disarnbiguation tasks, and present a strategy to identify the correct sense of a word in some context based on the space. The semantic space is a set of multidimensional real-valued vectors, which formally describe the contexts of words. Instead of locating all word senses in the space, we only make use of mono-sense words to outline it. We design a merging procedure to establish the dendrogram structure of the space and give an heuristic algorithm to find the nodes (sense clusters) corresponding with sets of similar senses in the dendrogram. Given a word in a particular context' the context would activate some clusters in the dendrogram, based on its similarity with the contexts of the words in the clusters, then the correct sense of the word could be determined by comparing its definitions with those of the words in the clusters.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Latent Semantic Word Sense Induction and Disambiguation

In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. Th...

متن کامل

Dependency-Based Construction of Semantic Space Models

Traditionally, vector-based semantic space models use word co-occurrence counts from large corpora to represent lexical meaning. In this article we present a novel framework for constructing semantic spaces that takes syntactic relations into account. We introduce a formalization for this class of models, which allows linguistic knowledge to guide the construction process. We evaluate our frame...

متن کامل

Semantic Relatedness for Biomedical Word Sense Disambiguation

This paper presents a graph-based method for all-word word sense disambiguation of biomedical texts using semantic relatedness as edge weight. Semantic relatedness is derived from a term-topic co-occurrence matrix. The sense inventory is generated by the MetaMap program. Word sense disambiguation is performed on a disambiguation graph via a vertex centrality measure. The proposed method achieve...

متن کامل

Ontology Based Query Expansion Using Word Sense Disambiguation

The existing information retrieval techniques do not consider the context of the keywords present in the user’s queries. Therefore, the search engines sometimes do not provide sufficient information to the users. New methods based on the semantics of user keywords must be developed to search in the vast web space without incurring loss of information. The semantic based information retrieval te...

متن کامل

AutoExtend: Combining Word Embeddings with Semantic Resources

We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings which incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1997